Fixed sound source vector generation method and fixed sound source codebook

ABSTRACT

At the speech encoding end, upon generation of a fixed excitation vector, the shape of an excitation vector output from pulse excitation codebook 301 is identified in pulse excitation vector shape identifier 302, a dispersion vector used for excitation vectors of that shape is output from dispersion vector storage 304, and, in dispersion vector convolution processor 303, dispersion vector convolution processing of the excitation vector is performed. In particular, when a pulse excitation vector having a specific shape of high frequency of use is output from pulse excitation codebook 301, pulse excitation vector shape identifier 302 controls dispersion vector storage 304 in such a way that an additional dispersion vector prepared exclusively for that pulse excitation vector is output. By this means, it is possible to provide a technology that improves the quality of decoded speech and that yields decoded speech that sounds more natural and is easier for the user to hear.

TECHNICAL FIELD

The present invention relates to a fixed excitation vector generation method and a fixed excitation codebook for use in a CELP type speech encoder or a CELP type speech decoder.

BACKGROUND ART

In such fields as digital communication, packet communication typified by Internet communication, and speech storage, speech signal encoders are used to compress speech information at high efficiency, so as to make efficient use of radio wave transmission path capacity and storage media.

Among these, methods based on the CELP (Code Excited Linear Prediction) method are widely used in practice at intermediate and low bit rates. A CELP technique that uses pulse excitation as a drive excitation signal is described in “Code-Excited Linear Prediction (CELP): High-quality Speech at Very Low Bit Rates” by M. R. Schroeder and B. S. Atal, Proc. ICASSP-85, 25.1.1, pp. 937-940, 1985.

In a CELP type speech encoding method, a digitized speech signal is divided into frames of a fixed frame length (approximately 5 ms to 50 ms), linear prediction of speech is performed on a per frame basis, and the linear prediction residual (excitation signal) from the linear prediction performed on a per frame basis is encoded using an adaptive codebook and a fixed codebook (including a stochastic codebook, random codebook, noise codebook and so on) composed of known waveforms.

The adaptive codebook holds drive excitation signals generated in the past and is used to represent a cyclic component of a speech signal. The fixed codebook holds a predetermined number of vectors, provided in advance and having predetermined shapes, and is chiefly used to represent a non-cyclic component that cannot be represented with the adaptive codebook.

As for the vectors stored in the fixed codebook, vectors composed of random noise sequences and/or vectors represented by combining a number of pulses are used.

A typical example of a fixed codebook that represents a vector by combining a number of pulses is the algebraic fixed codebook. The algebraic fixed codebook is described in detail, for example, in ITU-T Recommendation G.729 Annex-D. The algebraic fixed codebook has the advantages that a fixed excitation codebook can be searched with a small amount of computation and that the ROM capacity needed to hold excitation vectors is reduced. Still, the problem that a noise component is difficult to represent accurately persists.

One method for solving this problem with the algebraic fixed codebook is the pulse dispersion technique. Pulse dispersion is disclosed in ITU-T Recommendation G.729 Annex-D. Pulse dispersion is a method for generating a fixed excitation vector by convoluting a dispersion pattern (fixed waveform) with an excitation vector.

FIG. 1 is a block diagram showing an example of configuration of a fixed excitation codebook having a conventional pulse dispersion structure. Dispersed pulse codebook 10 comprises pulse excitation codebook 11, dispersion vector convolution processor 12, and dispersion vector storage 13.

An excitation vector is output from pulse excitation codebook 11, and a dispersion vector, taken from dispersion vector storage 13, is convoluted with this pulse excitation vector in dispersion vector convolution processor 12, thereby generating a fixed excitation vector (noise excitation vector).

Conventional pulse dispersion makes it possible to improve the performance of the pulse excitation codebook at low bit rates, such as below 4 kbit/s.

Still, greater quality improvement (that is, further improvement in the quality of decoded speech) will be required in next-generation mobile telephone systems, and it is difficult to meet such demand with existing technologies.

For instance, simply increasing the number of dispersion vector patterns does not by itself improve the quality of decoded speech, and risks increasing the required memory capacity and making signal processing more complex.

DISCLOSURE OF INVENTION

It is therefore an object of the present invention to provide a technique that further enhances the quality of decoded speech by improving the quality of speech at the speech encoding end and the speech decoding end, and that yields decoded speech that sounds more natural and is easier for the user to hear.

The above object is achieved, when a fixed excitation vector is generated at the speech encoding end, by selecting in advance a pulse excitation vector of a specific shape with high frequency of use from among many pulse excitation vectors, and preparing a dedicated dispersion vector corresponding to the selected pulse excitation vector.

In addition, the above object is achieved by, at the speech decoding end, applying high-frequency emphasis processing of novel characteristics to an excitation signal (a signal that imitates speech that originates in the human vocal tract) before it is input to a synthesis filter (having functions that imitate the human vocal tract).

BRIEF DESCRIPTION OF DRAWINGS

FIG. 1 is a block diagram showing an example of configuration of a fixed excitation codebook having a conventional pulse dispersion mechanism;

FIG. 2 is a drawing showing a simplified overall configuration of a speech signal transmitting apparatus and a speech signal receiving apparatus according to the present invention;

FIG. 3 is a block diagram showing a configuration of a speech encoder according to the first embodiment of the present invention;

FIG. 4 is a block diagram showing a configuration of a fixed excitation codebook according to the first embodiment of the present invention;

FIG. 5A is a drawing showing the distribution of the frequency of use of pulse excitation vectors according to the first embodiment of the present invention;

FIG. 5B is a drawing showing the distribution of the frequency of use of pulse excitation vectors according to the first embodiment of the present invention;

FIG. 6 is a drawing showing an example of an additional dispersion vector according to the first embodiment of the present invention;

FIG. 7 is a drawing showing an example of an additional dispersion vector according to the first embodiment of the present invention;

FIG. 8 is a drawing showing an example of an additional dispersion vector according to the first embodiment of the present invention;

FIG. 9 is a drawing showing an example of an additional dispersion vector according to the first embodiment of the present invention;

FIG. 10 is a drawing showing an example of an additional dispersion vector according to the first embodiment of the present invention;

FIG. 11 is a drawing showing an example of an additional dispersion vector according to the first embodiment of the present invention;

FIG. 12 is a drawing describing the detail of selection processing in a dispersion vector storage according to the first embodiment of the present invention;

FIG. 13 is a flowchart showing the steps of processing in a fixed excitation codebook according to the first embodiment of the present invention;

FIG. 14 is a block diagram showing another configuration of a fixed excitation codebook according to the first embodiment of the present invention;

FIG. 15 is a block diagram showing the steps of processing for searching a fixed excitation codebook according to the first embodiment of the present invention;

FIG. 16 is a block diagram showing a configuration of a speech decoder according to the second embodiment of the present invention; and

FIG. 17 is a block diagram showing a configuration of a high-range amplifying section according to the second embodiment of the present invention.

BEST MODE FOR CARRYING OUT THE INVENTION

With reference now to the accompanying drawings, embodiments of the present invention will be explained in detail below.

First, the overall configuration of a speech signal transmitting apparatus and a speech signal receiving apparatus of the present invention will be explained with reference to FIG. 2.

In FIG. 2, speech signal 101 is converted to an electrical signal by input apparatus 102, and is then output to A/D converter 103. A/D converter 103 converts the (analog) signal output from input apparatus 102 to a digital signal, and outputs this signal to speech encoder 104. Speech encoder 104 encodes the digital speech signal output from A/D converter 103 using a speech encoding method described later herein, and outputs encoded information to RF modulator 105. RF modulator 105 places the speech encoded information output from speech encoder 104 on a propagation medium such as a radio wave, converts the signal for sending, and outputs it to transmitting antenna 106. Transmitting antenna 106 sends out the signal output from RF modulator 105 as a radio wave (RF signal). RF signal 107 in the drawing is a radio wave (RF signal) transmitted from transmitting antenna 106. The above is the configuration and operation of the speech signal transmitting apparatus.

RF signal 108 is received by receiving antenna 109 and output to RF demodulator 110. RF signal 108 in the drawing is a radio wave as received by receiving antenna 109 and, if there is no signal attenuation or noise superimposition in the propagation path, is exactly the same as RF signal 107.

RF demodulator 110 demodulates speech encoded information from the RF signal output from receiving antenna 109, and outputs this information to speech decoder 111. Speech decoder 111 decodes a speech signal from the speech encoded information output from RF demodulator 110 using a speech decoding method described later herein, and outputs the resulting signal to D/A converter 112. D/A converter 112 converts the digital speech signal output from speech decoder 111 to an analog electrical signal, and outputs this signal to output apparatus 113. Output apparatus 113 converts the electrical signal to vibrations of the air, and outputs sound waves that are audible to the human ear. In the figure, reference number 114 indicates the sound waves that are output. The above is the configuration and operation of the speech signal receiving apparatus.

By providing at least one of the above-described kinds of speech signal transmitting apparatus and receiving apparatus, it is possible to configure a base station apparatus and mobile terminal apparatus in a mobile communication system.

Now, with reference to the drawings, improvement of generation of fixed excitation vectors using dispersion vectors at the speech encoding end (First Embodiment) and high-frequency emphasis processing at the speech decoding end (Second Embodiment) will be described in order.

First Embodiment

A case will be described here in the first embodiment where, in a fixed excitation codebook, a dedicated dispersion vector is provided for a pulse excitation vector of a predetermined shape, and an optimum dispersion vector is applied depending on the shape of the pulse excitation vector.

FIG. 3 is a block diagram showing a configuration of speech encoder 104 mounted in the speech signal transmitting apparatus of FIG. 2.

An input signal in speech encoder 104 is a signal output from A/D converter 103, and is input to preprocessing section 200. Preprocessing section 200 performs high-pass filter processing that eliminates the DC component in the input speech signal, or waveform shaping processing and pre-emphasis processing concerned with improving the performance of later encoding processing, and outputs the processed speech signal (Xin) to LPC analysis section 201 and adder 204.

LPC analysis section 201 performs linear predictive analysis using Xin, and outputs the result of the analysis (linear predictive coefficients) to LPC quantization section 202. LPC quantization section 202 performs quantization processing of the linear predictive coefficients (LPC), and outputs the quantized LPC to synthesis filter 203 while outputting code L indicating the quantized LPC to multiplexing section 213.

Synthesis filter 203 generates a reconstructed signal by filter-synthesizing a drive excitation output from adder 210, explained later herein, using LPC coefficients based on the quantized LPC, and outputs the reconstructed signal to adder 204.
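The filter synthesis performed by synthesis filter 203 can be pictured with a minimal Python sketch of an all-pole filter. The function name `synthesis_filter`, the sign convention of the coefficients, and the zero initial filter state are assumptions made for illustration; the specification does not fix these details.

```python
def synthesis_filter(excitation, lpc):
    """All-pole synthesis: y[n] = x[n] - sum_{k=1..P} lpc[k-1] * y[n-k].

    Assumptions (not from the specification): the predictor polynomial is
    A(z) = 1 + sum lpc[k-1] * z^-k, and the filter memory starts at zero.
    """
    out = []
    for n, x in enumerate(excitation):
        acc = x
        for k, a in enumerate(lpc, start=1):
            if n - k >= 0:
                acc -= a * out[n - k]
        out.append(acc)
    return out


# Toy first-order example: an impulse excitation decays geometrically.
reconstructed = synthesis_filter([1.0, 0.0, 0.0, 0.0], [-0.5])
```

In an actual encoder the filter memory would normally carry over from the previous frame; the zero state here only keeps the sketch short.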

Adder 204 calculates an error signal between the aforementioned Xin and the aforementioned reconstructed signal, and outputs this error signal to auditory weighting section 211. Auditory weighting section 211 performs auditory weighting on the error signal output from adder 204, calculates distortion between Xin and the reconstructed signal in the auditory weighting domain, and outputs this distortion to parameter determination section 212.

Parameter determination section 212 selects an adaptive excitation vector, a fixed excitation vector, and a quantization gain that minimize the above encoding distortion from adaptive excitation codebook 205, fixed excitation codebook 207 and quantization gain generation section 206, and outputs adaptive excitation vector code (A), excitation gain code (G) and fixed excitation vector code (F) that indicate the result of the selection, to multiplexing section 213. In addition, when the shape of a pulse excitation vector selected in fixed excitation codebook 207 is a predetermined specific shape, the best dispersion vector is selected from the set of additional dispersion vectors prepared for the vector of that specific shape. Parameter determination section 212 checks whether there are dispersion vectors that reduce quantization error below that of the fundamental dispersion vector, selects the dispersion vector that minimizes quantization error from among the fundamental dispersion vector and the additional dispersion vectors, and outputs a control signal indicating the selection result to fixed excitation codebook 207.

Adaptive excitation codebook 205 buffers drive excitation signals output by adder 210 in the past, and, from the past drive excitation signal samples specified by a signal output from parameter determination section 212, cuts out one frame of samples as an adaptive excitation vector and outputs this to multiplier 208.

Quantization gain generation section 206 outputs to multipliers 208 and 209, respectively, an adaptive excitation gain and a fixed excitation gain specified by a signal output from parameter determination section 212.

Fixed excitation codebook 207 outputs to multiplier 209 a fixed excitation vector obtained by convoluting a dispersion vector with a pulse excitation vector that has the shape specified by a signal output from parameter determination section 212. The configuration of this fixed excitation codebook 207 is a major characteristic of the present embodiment, and this characteristic part will be described later in detail.

Multiplier 208 multiplies the adaptive excitation vector output from adaptive excitation codebook 205 by the quantized adaptive excitation gain output from quantization gain generation section 206, and outputs the result to adder 210.

Multiplier 209 multiplies the fixed excitation vector output from fixed excitation codebook 207 by the quantized fixed excitation gain output from quantization gain generation section 206, and outputs the result to adder 210.

Adder 210 has as inputs the adaptive excitation vector and the fixed excitation vector after gain multiplication from multipliers 208 and 209, respectively, performs vector-addition of them, and outputs a drive excitation of the addition result to synthesis filter 203 and adaptive excitation codebook 205.
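The combined operation of multipliers 208 and 209 and adder 210 amounts to a gain-weighted sum of two vectors. The following Python sketch illustrates this; the function and argument names are hypothetical and the numeric values are toy data.

```python
def drive_excitation(adaptive_vector, fixed_vector, adaptive_gain, fixed_gain):
    """Gain-scale both excitation vectors and add them sample by sample."""
    if len(adaptive_vector) != len(fixed_vector):
        raise ValueError("vectors must have the same frame length")
    return [adaptive_gain * a + fixed_gain * f
            for a, f in zip(adaptive_vector, fixed_vector)]


# Toy two-sample frame.
excitation = drive_excitation([1.0, 0.0], [0.0, 2.0], 0.5, 0.25)
```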

Multiplexing section 213 has as inputs code L indicating the quantized LPC from LPC quantization section 202, and code A indicating the adaptive excitation vector, code F indicating the fixed excitation vector, and code G indicating the quantization gain from parameter determination section 212, multiplexes this information, and outputs it to the propagation path as encoded information.

The above explains each component part of speech encoder 104.

The detailed configuration and features of fixed excitation codebook 207 will be explained next with reference to the drawings.

FIG. 4 is a block diagram showing a configuration of fixed excitation codebook 207 of FIG. 3.

Referring to FIG. 4, pulse excitation codebook 301 outputs a pulse excitation vector to pulse excitation vector shape identifier 302 and to dispersion vector convolution processor 303.

Pulse excitation vector shape identifier 302 associates each predetermined vector shape with the parameters that specify that vector shape and stores them in a memory. If the pulse excitation vector consists of only several pulses, the shape is determined based on the distance between the pulses (i.e., how many samples apart they are) and the polarity relationship of the pulses (heteropolar or homopolar). In the present case, the distance between the pulses and the polarity relationship of the pulses are the parameters.

Then, pulse excitation vector shape identifier 302 compares the parameters of the pulse excitation vector output from pulse excitation codebook 301 with the parameters of each stored vector shape, and, when for instance all the parameters match, judges that these vectors have the same shape. If the pulse excitation vector consists of only a few pulses, pulse excitation vector shape identifier 302 judges that these vectors have the same shape, provided that they share the same relative positions between the respective pulses and the same polarity relationship. Moreover, vectors that have the same pulse intervals and pulse polarity relationship but are shifted in the time axis direction, and vectors that are scaled by a constant factor (in pulse amplitude), are also judged to be vectors of the same shape.
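A minimal Python sketch of this shape comparison for vectors of one or two pulses is given below. The names `shape_parameters` and `same_shape`, and the choice to treat a single pulse as "distance 0, homopolar", are illustrative assumptions rather than details taken from the specification.

```python
def shape_parameters(pulse_vector):
    """Return (distance, relation) for a vector with one or two nonzero samples.

    The parameters ignore the absolute position and the amplitude scale, so
    shifted or scaled copies of a vector yield the same result.
    """
    pulses = [(i, a) for i, a in enumerate(pulse_vector) if a != 0]
    if len(pulses) == 1:
        # Two overlapping pulses of equal polarity merge into one pulse.
        return (0, "homopolar")
    if len(pulses) != 2:
        raise ValueError("this sketch handles one or two pulses only")
    (i1, a1), (i2, a2) = pulses
    relation = "homopolar" if a1 * a2 > 0 else "heteropolar"
    return (i2 - i1, relation)


def same_shape(u, v):
    """Judge two pulse vectors to have the same shape if all parameters match."""
    return shape_parameters(u) == shape_parameters(v)


a = [0.0] * 10
a[3], a[5] = 1.0, 1.0       # distance 2, same polarity
b = [0.0] * 10
b[6], b[8] = -2.0, -2.0     # shifted and scaled copy of a
c = [0.0] * 10
c[3], c[5] = 1.0, -1.0      # distance 2, opposite polarity
```

With these toy vectors, `same_shape(a, b)` holds while `same_shape(a, c)` does not.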

When there is a stored vector of the same shape, pulse excitation vector shape identifier 302 outputs a control signal to dispersion vector storage 304 so as to output an additional dispersion vector designed exclusively for pulse excitation vectors of this shape. On the other hand, when there is no stored vector of the same shape, pulse excitation vector shape identifier 302 outputs a control signal to dispersion vector storage 304 so as to output a fundamental dispersion vector.

Dispersion vector storage 304 stores in a memory, besides the fundamental dispersion vector used commonly for all pulse excitation vectors, an additional dispersion vector used for pulse excitation vectors of a predetermined shape, and switches the dispersion vector output to dispersion vector convolution processor 303 in accordance with the control signal from parameter determination section 212 and the control signal from pulse excitation vector shape identifier 302. That is, dispersion vector storage 304 selects the dispersion vector that corresponds to the pulse excitation vector shape identified in pulse excitation vector shape identifier 302, and outputs it to dispersion vector convolution processor 303.

Dispersion vector convolution processor 303 convolutes the pulse excitation vector output from pulse excitation codebook 301 with the dispersion vector taken from dispersion vector storage 304. By this means, a fixed excitation vector (noise excitation vector) is generated.
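The convolution step can be sketched in a few lines of Python. The function name `convolve_dispersion`, the toy dispersion vector, and the choice to truncate the result at the frame boundary (rather than, say, wrapping around) are assumptions made for illustration only.

```python
def convolve_dispersion(pulse_vector, dispersion_vector):
    """Convolve a pulse excitation vector with a dispersion vector.

    Assumption: samples that would fall beyond the frame are discarded.
    """
    n = len(pulse_vector)
    out = [0.0] * n
    for i, amp in enumerate(pulse_vector):
        if amp == 0:
            continue  # only the few nonzero pulses contribute
        for k, d in enumerate(dispersion_vector):
            if i + k < n:
                out[i + k] += amp * d
    return out


# Two heteropolar pulses, two samples apart, in an 8-sample toy frame.
pulse = [0.0] * 8
pulse[2] = 1.0
pulse[4] = -1.0
dispersion = [1.0, 0.5, 0.25]   # made-up dispersion pattern
fixed_vector = convolve_dispersion(pulse, dispersion)
```

Because a pulse excitation vector is sparse, the cost of this loop grows with the number of pulses rather than with the square of the frame length.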

By this selection and convolution of an optimum dispersion vector in accordance with the shape of an excitation vector, it is possible to improve encoding performance compared to when a predetermined dispersion vector (one type or a plurality of types of fundamental dispersion vectors) is applied to all pulse excitation vectors.

Here, although the number of vector shapes stored in the memory of pulse excitation vector shape identifier 302 is arbitrary, by preparing additional dispersion vectors only for those vectors of specific shapes of high frequency of use, it is possible to limit the number of additional vectors and minimize the increase in ROM capacity that results from the introduction of additional dispersion vectors.

Now, a method of selecting an excitation vector of a specific shape of high frequency of use that is stored in advance in the memory of pulse excitation vector shape identifier 302, and a method of selecting an additional dispersion vector applied thereto, will be described in detail.

FIG. 5A and FIG. 5B are drawings showing the distribution of the frequency of use of pulse excitation vectors (two pulses) output from pulse excitation codebook 301, based on the parameters of the distance between pulses and the polarity of each pulse, obtained by collecting several hours of actually encoded speech data.

FIG. 5B is a drawing that enlarges FIG. 5A in the direction of the horizontal axis. In FIGS. 5A and 5B, the horizontal axis indicates the distance between pulses (samples), and the vertical axis indicates the normalized frequency of use at which an excitation vector of a given distance between pulses is used. Moreover, in FIG. 5A and FIG. 5B, the origin, where the two pulses overlap, corresponds to an excitation vector containing one pulse, the left side of the origin corresponds to combinations of heteropolar pulses, and the right side corresponds to combinations of homopolar pulses.

The normalized frequency of use refers to the value obtained by dividing the number of times the pulse excitation vector of each interval is used by the number of combinations of pulses in each interval. For instance, when the interval is 1 sample, there are a number of combinations, such as the first pulse at sample 1 and the second pulse at sample 2, the first pulse at sample 2 and the second pulse at sample 3, and so on, and the frequency is normalized by the number of all such combinations that the pulse excitation codebook can generate.
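As a rough illustration of this normalization, the Python sketch below divides the usage count of each shape by the number of placements available to that shape. It assumes an 80-sample frame in which a shape with distance d can be placed in 80 − d ways, which agrees with the 78, 79 and 80 patterns listed later in this description for distances 2, 1 and 0; the function name and the usage data are made up.

```python
from collections import Counter


def normalized_usage(selected_shapes, frame_len=80):
    """Map each (distance, relation) shape to count / number of placements.

    Assumption: a shape with pulse distance d fits into the frame in
    (frame_len - d) different positions.
    """
    counts = Counter(selected_shapes)
    return {shape: n / (frame_len - shape[0]) for shape, n in counts.items()}


# Made-up search results: 79 selections of one shape, 39 of another.
usage = [(1, "homopolar")] * 79 + [(2, "heteropolar")] * 39
freq = normalized_usage(usage)
```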

As is obvious from FIG. 5A and FIG. 5B, regardless of the combination of polarities, the frequency of use concentrates on excitation vectors in which the distance between the two pulses is less than three samples.

Here, the 5 types of excitation vectors in which the distance between the 2 pulses is less than three samples (distance between pulses 0; distance between pulses 1 and homopolar pulses; distance between pulses 1 and heteropolar pulses; distance between pulses 2 and homopolar pulses; distance between pulses 2 and heteropolar pulses) are selected and stored in the memory of pulse excitation vector shape identifier 302.

Next, for each excitation vector selected, a dedicated, additional dispersion vector is designed through learning.

The learning of dispersion vectors is performed based on the generalized Lloyd algorithm, as shown in section 3.1 of K. Yasunaga et al., “Dispersed-pulse codebook and its application to 4 kb/s speech coder,” Proc. ICASSP2000, pp. 1503-1506, 2000, and dispersion vectors that minimize the total encoding distortion over the learning data are determined.

FIG. 6-FIG. 10 show examples of designed additional dispersion vectors, each showing a case where 4 types of additional dispersion vectors are designed for each excitation vector.

FIG. 6 shows that four types of dedicated dispersion vectors (A1-A4) are assigned to an excitation vector having two samples of distance between pulses and homopolar pulse polarities. Similarly, FIG. 7 shows that four types of additional dispersion vectors (B1-B4) are provided for an excitation vector having one sample of distance between pulses and homopolar pulse polarities. Similarly, FIG. 8, FIG. 9, and FIG. 10 show that four types of additional dispersion vectors are provided respectively for excitation vectors having 0 samples of distance between pulses and homopolar polarities, having 1 sample of distance between pulses and heteropolar polarities, and having 2 samples of distance between pulses and heteropolar polarities. As is obvious from FIG. 6-FIG. 10, the shapes of the additional dispersion vectors obtained in correspondence to the 5 types of pulse excitation vectors have different features.

When learning is performed using common dispersion vectors for all excitation vectors, a vector is obtained whose shape is an average of these dispersion vectors having different features, which limits the improvement in performance. An example of a fundamental dispersion vector is shown in FIG. 11.

Although with FIG. 6-FIG. 10 cases are explained on the premise that each excitation vector is assigned 4 types of additional dispersion vectors, the present invention is by no means limited to this. For instance, the number (type) of additional dispersion vectors shown in FIG. 6-FIG. 10 can be one.

Moreover, although no drawing shows such a case, when there are 3 pulses, each excitation vector having a specific shape of high frequency of use is likewise provided with a unique additional dispersion vector.

FIG. 12 is a drawing showing the content of selection processing in dispersion vector storage 304 where additional dispersion vectors are provided as shown in FIG. 6-FIG. 10.

As shown in FIG. 12, dispersion vector storage 304 comprises a plurality of dispersion vector subsets 400-405.

Dispersion vector subset 400, comprising terminal X0 that outputs a fundamental dispersion vector, outputs the fundamental dispersion vector to dispersion vector convolution processor 303 via switch 406.

Dispersion vector subset 401, comprising terminals A1-A4 that output the four additional dispersion vectors shown in FIG. 6 and terminal A0 that outputs the fundamental dispersion vector, selects one dispersion vector determined by parameter determination section 212 from among the 5 types of dispersion vectors A0-A4 by means of switch 407 and outputs this to dispersion vector convolution processor 303 via switch 406.

Similarly, dispersion vector subsets 402-405, comprising terminals B1-B4, C1-C4, D1-D4, and E1-E4 that output the four additional dispersion vectors shown in FIG. 7-FIG. 10, and terminals B0, C0, D0, and E0 that output the fundamental dispersion vector, respectively, each select one dispersion vector determined in parameter determination section 212 by means of switches 408, 409, 410, and 411, and output it to dispersion vector convolution processor 303 via switch 406.

In FIG. 12, the fundamental vectors output from terminals X0, A0, B0, C0, D0, and E0 are identical.

Switch 406, which performs the switching of dispersion vector subsets 400-405, switches in accordance with the shape of the pulse excitation vector output from pulse excitation codebook 301 and based on control of pulse excitation vector shape identifier 302. That is, when a pulse excitation vector of a specific shape of high frequency of use is input from pulse excitation codebook 301 into pulse excitation vector shape identifier 302, switch 406 is connected to the one of dispersion vector subsets 401-405 that corresponds to pulse excitation vectors of that shape. When a pulse excitation vector of a non-specific shape is input from pulse excitation codebook 301 into pulse excitation vector shape identifier 302, switch 406 is connected to the output terminal of dispersion vector subset 400.

Switches 407-411 connect with the terminals in dispersion vector subsets 401-405 that output the dispersion vectors determined in parameter determination section 212 from among the 5 types of dispersion vectors.

According to the above configuration, when an excitation vector whose shape is identical to one stored in pulse excitation vector shape identifier 302 is output from pulse excitation codebook 301, the optimum dispersion vector is selected from among 5 types including 4 types of additional dispersion vectors and a fundamental dispersion vector.

Referring to FIG. 12, although there are 5 dispersion vector subsets provided with additional dispersion vectors, the number of dispersion vector subsets is by no means limited by the present invention and can be increased or decreased depending on the number of pulse excitation vector patterns with high frequency of use. Similarly, although each dispersion vector subset is provided with 4 types of additional dispersion vectors, the present invention sets no limit on the number of additional dispersion vectors.

FIG. 13 shows the steps of the important parts of the above-described processing. FIG. 13 is a flowchart showing the processing flow of a fixed excitation codebook search in FIG. 4.

First, in ST501, a pulse excitation search is performed using a fundamental dispersion vector. An impulse may be used for the fundamental dispersion vector (that is, no dispersion). A specific search method is disclosed, for instance, in Laid-Open Japanese Patent Application Publication No. HEI10-63300 (the 17th paragraph (“Background Art”) and the 51st through 54th paragraphs), and in section 2.2 of K. Yasunaga et al., “Dispersed-pulse codebook and its application to 4 kb/s speech coder,” Proc. ICASSP2000, pp. 1503-1506, 2000.

Next, in ST502, whether the pulse excitation vector selected in ST501 has the parameters (pulse positions and combination of signs) of a predetermined specific shape is checked.

These specific shapes refer to the shapes of those vectors, among the pulse excitation vectors generated from the pulse excitation codebook, that are frequently used as a fixed excitation vector (selected as a result of search).

To be more specific, among 2-pulse excitations, vectors of high frequency of use include, for instance, those having the shape in which the distance between pulses is 1 sample (for instance, excitation pulses occur in the 11th sample and in the 12th sample) and the pulses have different polarities, and the shape in which the distance between pulses is 2 samples (for instance, excitation pulses occur in the 20th sample and in the 22nd sample) and the pulses have the same polarity.

When the excitation vector does not have one of these specific shapes, the pulse excitation vector selected in ST501 is convoluted with the fundamental dispersion vector and used as a fixed excitation vector.

That is, switch 406 of FIG. 12 is connected to terminal X0 of dispersion vector subset 400. If the pulse excitation vector selected in ST501 has a specific shape, ST503 follows.

ST503 checks whether there are dispersion vectors, among the additional dispersion vectors of the dispersion vector subsets (dispersion vector subsets 401-405 of FIG. 12) provided exclusively for vectors of specific shapes, that make quantization error less than the fundamental dispersion vector does, and selects the dispersion vector that minimizes quantization error from among the fundamental dispersion vector and the additional dispersion vectors. Pulse excitation vector shape identifier 302 selects the appropriate dispersion vector subset containing the additional dispersion vectors.

The result of convoluting the pulse excitation vector selected in ST501 with the dispersion vector selected in ST502 or in ST503 is determined as a fixed excitation code vector.
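The selection among the fundamental and additional dispersion vectors can be sketched as follows. This is a simplified illustration, not the search of the cited references: the error measure is a plain squared error against a hypothetical target vector (no auditory weighting, synthesis filtering or gain optimization), and all names and numeric values are assumptions.

```python
def convolve(pulse_vector, dispersion_vector):
    """Convolve and truncate to the frame length (truncation is an assumption)."""
    n = len(pulse_vector)
    out = [0.0] * n
    for i, amp in enumerate(pulse_vector):
        for k, d in enumerate(dispersion_vector):
            if amp != 0 and i + k < n:
                out[i + k] += amp * d
    return out


def squared_error(target, candidate):
    return sum((t - c) ** 2 for t, c in zip(target, candidate))


def select_dispersion(pulse_vector, target, fundamental, additional):
    """Return (index, fixed_vector); index 0 denotes the fundamental vector.

    `additional` is empty when the pulse vector has no specific shape, in
    which case the fundamental dispersion vector is used unconditionally.
    An additional vector wins only if it is strictly better.
    """
    best_index = 0
    best_vector = convolve(pulse_vector, fundamental)
    best_error = squared_error(target, best_vector)
    for index, dispersion in enumerate(additional, start=1):
        vector = convolve(pulse_vector, dispersion)
        error = squared_error(target, vector)
        if error < best_error:
            best_index, best_vector, best_error = index, vector, error
    return best_index, best_vector


pulse = [0.0, 1.0, 0.0, 1.0, 0.0, 0.0]     # two homopolar pulses, distance 2
target = [0.0, 1.0, 0.5, 1.0, 0.5, 0.0]    # made-up target
fundamental = [1.0]                         # impulse, i.e. no dispersion
additional = [[1.0, 0.5], [1.0, -0.5]]      # made-up dedicated vectors
index, vector = select_dispersion(pulse, target, fundamental, additional)
```

In this toy case the first additional dispersion vector reproduces the target exactly, so `index` is 1.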

Such a configuration, in which a number of dedicated additional dispersion vectors are provided only for pulse excitation vectors having specific shapes of high frequency of use, minimizes the increase in the amount of information and is more readily implementable, and, when the pulse excitation codebook has codes that are not used, may even be implemented without any increase in the number of bits.

Now, the encoding and decoding of a fixed excitation codebook generated by the above method will be explained with a specific example. For example, a case will be described here where there are 2 pulses in 80 samples. Each pulse can occur in any 1 sample of the 80 samples. The two pulses, referred to as pulse 1 and pulse 2, may even occur in the same sample and overlap. The pulse amplitude in this case is the sum of the amplitudes of pulse 1 and pulse 2, and if each pulse has an amplitude of 1, this will be one pulse with an amplitude of 2. When the 2 pulses occur in different samples, their combinations will be 80C2=3160 patterns. The polarity relationship of the two pulses has 2 patterns, homopolarity and heteropolarity, and so the shape of a pulse excitation vector has 3160×2=6320 patterns. The 80 patterns of the case where the two pulses overlap and become one are added thereto, and so there are 6400 patterns in total for the shape of a pulse excitation vector. Finally, the polarity of the pulse excitation vector as a whole has two patterns, and so there are 6400×2=12800 patterns (less than 2^14=16384, so 14 bits suffice). Then, by representing the polarity of pulse 1 by one bit, such that when pulse 2 is behind pulse 1 the 2 pulses are heteropolar and when pulse 1 and pulse 2 are at the same position or pulse 2 is ahead the 2 pulses are homopolar, it is possible to express the 12800 patterns of vectors with 14 bits.
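The pattern count above can be checked with a few lines of Python; the variable names are illustrative only.

```python
from math import comb

FRAME_LEN = 80

distinct_positions = comb(FRAME_LEN, 2)            # two pulses in different samples
with_polarity_relation = distinct_positions * 2    # homopolar or heteropolar
shapes = with_polarity_relation + FRAME_LEN        # plus the overlapping cases
patterns = shapes * 2                              # overall polarity of the vector

bits_needed = (patterns - 1).bit_length()          # bits to index all patterns
```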

Now, the method of representing the above fixed codebook in 14-bit codes will be explained.

First, a pulse excitation search is performed, and the position and sign of pulse 1 and pulse 2 are determined. Next, the positional relationship between pulse 1 and pulse 2 is checked. If pulse 2 is behind pulse 1, whether the polarity relationship between pulse 1 and pulse 2 is heteropolar is checked, and if it is not heteropolar, the positions of pulse 1 and pulse 2 are swapped. On the other hand, when pulse 1 and pulse 2 are at the same position or pulse 2 is ahead, whether the polarity relationship between pulse 1 and pulse 2 is homopolar is checked, and, when it is not homopolar, the positions of pulse 1 and pulse 2 are swapped.

Pulse 1 and pulse 2 determined thus are encoded as follows. Assume that the 14 bits are numbered 0-13 (bit 0 being the lowest bit). Bit 13 (=S), which is the highest bit, is the one bit that represents the sign of pulse 1, which is 1 when positive and 0 when negative.

Next, the combination of the positions of the 2 pulses is encoded. For example, assuming that the position of pulse 1 is p1 and the position of pulse 2 is p2, code CF is obtained as CF=p1×80+p2. CF obtained thus takes values 0-6399 and is represented with the 13 bits 0-12, which can express 0-8191. As a result, it is possible to assign fixed code vectors, to which additional dispersion vectors are applied, to the remaining codes 6400-8191.
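The ordering convention and the base code can be sketched in C as follows. This is a minimal illustration under two assumptions: swapping is interpreted as relabeling the two pulses, so that each sign travels with its position, and two overlapping pulses of opposite sign (which would cancel) are assumed not to be produced by the search. The type `Pulse` and the function names are hypothetical.

```c
#include <assert.h>

#define FRAME_LEN 80   /* samples per vector in this example */

typedef struct {
    int pos;    /* 0 .. FRAME_LEN-1 */
    int sign;   /* +1 or -1 */
} Pulse;

/* Relabel the pulses so that order carries relative polarity:
 * p2 > p1 means heteropolar, p2 <= p1 means homopolar. */
void canonicalize(Pulse *p1, Pulse *p2)
{
    int hetero = (p1->sign != p2->sign);
    int p2_behind = (p2->pos > p1->pos);
    /* assumption: overlapping pulses of opposite sign do not occur */
    assert(!(p1->pos == p2->pos && hetero));
    if (p2_behind != hetero) {
        Pulse t = *p1;
        *p1 = *p2;
        *p2 = t;
    }
}

/* Base code used with the fundamental dispersion vector: 0..6399. */
int base_code(Pulse p1, Pulse p2)
{
    assert(p1.pos >= 0 && p1.pos < FRAME_LEN);
    assert(p2.pos >= 0 && p2.pos < FRAME_LEN);
    return p1.pos * FRAME_LEN + p2.pos;
}
```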

If 5 types of shapes of pulse excitation vectors in which:

(1) Distance between pulse 1 and pulse 2 is 2 samples, homopolar (78 patterns);

(2) Distance between pulse 1 and pulse 2 is 1 sample, homopolar (79 patterns);

(3) Distance between pulse 1 and pulse 2 is 0 samples, homopolar (80 patterns);

(4) Distance between pulse 1 and pulse 2 is 1 sample, heteropolar (79 patterns); and

(5) Distance between pulse 1 and pulse 2 is 2 samples, heteropolar (78 patterns),

are each assigned 4 types of additional dispersion vectors, (1) is 78×4=312 and can be assigned codes 6400-6711; (2) is 79×4=316 and can be assigned codes 6712-7027; (3) is 80×4=320 and can be assigned codes 7028-7347; (4) is 79×4=316 and can be assigned codes 7348-7663; and (5) is 78×4=312 and can be assigned codes 7664-7975. To be specific, if the index of the additional dispersion vector selected by search processing is dv (=0-3), code CF is generated as follows according to the shape determined by the pulse excitation vector shape identifier:

(1) CF=6400+78×dv+(p1−2), (2≦p1≦79);

(2) CF=6712+79×dv+(p1−1), (1≦p1≦79);

(3) CF=7028+80×dv+p1, (0≦p1≦79);

(4) CF=7348+79×dv+p1, (0≦p1≦78); and

(5) CF=7664+78×dv+p1, (0≦p1≦77).

Finally, the sign bit is attached to the top, and thus transmission code F is generated (F=S×8192+CF).

The position p1 and sign s1 of pulse 1, the position p2 and sign s2 of pulse 2, and applicable dispersion vector information are encoded as above.
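The code assignment for the five specific shapes, together with the sign bit, can be sketched in C as below. The table layout and the function names `additional_code` and `transmission_code` are hypothetical; the offsets and pattern counts are those of the example (6400, 6712, 7028, 7348, 7664 and 78, 79, 80, 79, 78).

```c
#include <assert.h>

/* First code, pattern count, and smallest p1 for shapes (1)..(5). */
static const int SHAPE_BASE[5]   = {6400, 6712, 7028, 7348, 7664};
static const int SHAPE_COUNT[5]  = {78, 79, 80, 79, 78};
static const int SHAPE_P1_MIN[5] = {2, 1, 0, 0, 0};

/* shape: 0..4 for shapes (1)..(5); dv: 0..3; p1: position of pulse 1. */
int additional_code(int shape, int dv, int p1)
{
    assert(shape >= 0 && shape < 5);
    assert(dv >= 0 && dv < 4);
    int idx = p1 - SHAPE_P1_MIN[shape];
    assert(idx >= 0 && idx < SHAPE_COUNT[shape]);
    return SHAPE_BASE[shape] + SHAPE_COUNT[shape] * dv + idx;
}

/* F = S*8192 + CF, where the S bit is 1 when pulse 1 is positive. */
int transmission_code(int s1, int cf)
{
    assert(s1 == 1 || s1 == -1);
    assert(cf >= 0 && cf < 8192);
    return (s1 > 0 ? 8192 : 0) + cf;
}
```

With these tables the last assigned code is 7975, so codes 7976-8191 remain unused in this example.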

Next, the decoding by a decoder that received transmission code F will be explained. In the decoder, the two pulse positions (p1, p2) and the signs (s1, s2) are determined in the following steps.

First, sign information S is decoded from received code F.

S=((F>>13)&1)×2−1 (S becomes −1 or +1)

Next, pulse position information code CF is decoded.

CF=F&0x1FFF

Next, depending on the value of CF, the processing switches as follows:

(1) CF is less than 6400

p2=CF % 80, p1=(CF−p2)÷80

s1=S, s2=−S (where p2>p1), s2=+S (where p2≦p1)

For the dispersion vector, the fundamental dispersion vector is used.

(2) CF is greater than or equal to 6400 and less than 6712

p1=(CF−6400) % 78+2, p2=p1−2, s1=s2=S

The dv-th additional dispersion vector of subset 1 (FIG. 6) is used.

dv=((CF−6400)−(p1−2))÷78

(3) CF is greater than or equal to 6712 and less than 7028

p1=(CF−6712) % 79+1, p2=p1−1, s1=s2=S

The dv-th additional dispersion vector of subset 2 (FIG. 7) is used.

dv=((CF−6712)−(p1−1))÷79

(4) CF is greater than or equal to 7028 and less than 7348

p1=(CF−7028) % 80, p2=p1, s1=s2=S

The dv-th additional dispersion vector of subset 3 (FIG. 8) is used.

dv=((CF−7028)−p1)÷80

(5) CF is greater than or equal to 7348 and less than 7664

p1=(CF−7348) % 79, p2=p1+1, s1=S, s2=−S

The dv-th additional dispersion vector of subset 4 (FIG. 9) is used.

dv=((CF−7348)−p1)÷79

(6) CF is greater than or equal to 7664 and less than 7976

p1=(CF−7664) % 78, p2=p1+2, s1=S, s2=−S

The dv-th additional dispersion vector of subset 5 (FIG. 10) is used.

dv=((CF−7664)−p1)÷78

The position p1 and sign s1 of pulse 1, the position p2 and sign s2 of pulse 2, and applicable dispersion vector information are decoded as above.
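The decoding cases can be condensed into one table-driven C sketch. The struct `FixedCode` and the function `decode_fixed_code` are hypothetical names; the sketch assumes that codes 7976-8191, which the example leaves unassigned, are never received.

```c
#include <assert.h>

typedef struct {
    int p1, p2;   /* pulse positions, 0..79 */
    int s1, s2;   /* pulse signs, +1 or -1 */
    int subset;   /* 0 = fundamental dispersion vector, 1..5 = subset */
    int dv;       /* index within the subset (0 when subset == 0) */
} FixedCode;

FixedCode decode_fixed_code(int f)
{
    static const int base[5]   = {6400, 6712, 7028, 7348, 7664};
    static const int count[5]  = {78, 79, 80, 79, 78};
    static const int p1_min[5] = {2, 1, 0, 0, 0};
    static const int offset[5] = {-2, -1, 0, 1, 2};   /* p2 - p1 */
    FixedCode c;

    assert(f >= 0 && f < 16384);
    int s  = ((f >> 13) & 1) * 2 - 1;   /* -1 or +1 */
    int cf = f & 0x1FFF;
    assert(cf < 7976);                  /* assumption: unassigned codes unused */

    c.s1 = s;
    if (cf < 6400) {                    /* case (1): base code */
        c.p2 = cf % 80;
        c.p1 = (cf - c.p2) / 80;
        c.s2 = (c.p2 > c.p1) ? -s : s;
        c.subset = 0;
        c.dv = 0;
        return c;
    }
    int k = 4;                          /* cases (2)..(6) */
    while (cf < base[k])
        k--;
    int r = cf - base[k];
    c.p1 = r % count[k] + p1_min[k];
    c.p2 = c.p1 + offset[k];
    c.s2 = (offset[k] > 0) ? -s : s;
    c.subset = k + 1;
    c.dv = r / count[k];
    return c;
}
```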

FIG. 14 is a block diagram showing another configuration of a fixed excitation codebook.

Fixed excitation codebook 207 of FIG. 14 comprises two fixed excitation codebook subsets 608 and 609. First fixed excitation codebook subset 608 comprises three blocks, namely first pulse excitation codebook 601, dispersion vector storage 602, and dispersion vector convolution processor 603. First pulse excitation codebook 601 is an excitation codebook that generates predetermined pulse excitation vectors (for example, vectors composed of two pulses). Dispersion vector storage 602 stores the dispersion vectors designed dedicated to first pulse excitation codebook 601. Dispersion vector convolution processor 603 convolves a dispersion vector output from dispersion vector storage 602 with a pulse excitation vector output from first pulse excitation codebook 601.

Similarly, second fixed excitation codebook subset 609 comprises three blocks, namely second pulse excitation codebook 604 (which differs from first pulse excitation codebook 601 and generates, for instance, pulse excitation vectors composed of 3 or 5 pulses), dispersion vector storage 605, and dispersion vector convolution processor 606.

The dispersion vector storages inside the fixed excitation codebook subsets are each designed dedicated to the pulse excitation codebook of the respective subset.

Although a case has been described with the present embodiment where the number of subsets in a fixed excitation codebook is 2, the present invention sets no limit on the number, and the same effect can still be achieved when the number is 3 or more.

Moreover, the pulse excitation codebooks in the respective subsets may differ in the number of excitation pulses included in an excitation vector or in the patterns of excitation pulses (for example, one pulse excitation codebook generates only combinations of closely positioned pulses, while the other pulse excitation codebook generates combinations of widely separated pulses).

In any case, generating excitation vectors of different features and characteristics on a per subset basis increases the degree of performance improvement. Switch 607 selects one of the fixed excitation vectors output from dispersion vector convolution processor 603 and from dispersion vector convolution processor 606.

This fixed excitation codebook generates a fixed excitation vector specified by signal (F) input from parameter determination section 212 by means of first fixed excitation codebook subset 608 or second fixed excitation codebook subset 609, and outputs the result as a fixed excitation vector via switch 607.

FIG. 15 is a flowchart showing the processing steps of searching the fixed excitation codebook of FIG. 14.

First, in ST701, the first fixed excitation codebook subset is searched, and a fixed excitation vector that minimizes quantization error is selected.

Next, in ST702, the second fixed excitation codebook subset is searched, and, if there is a fixed excitation vector that yields a smaller quantization error than the fixed excitation vector selected in ST701, this is selected as the final fixed excitation vector.
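The two-stage search of ST701 and ST702 can be sketched in C as follows. This is only an illustration: it assumes the quantization error of every candidate vector has already been computed into one array per subset, whereas the text does not specify how the error is evaluated. The type `SearchResult` and the function `search_two_subsets` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    int    subset;   /* 1 or 2 */
    size_t index;    /* index of the chosen vector within that subset */
    double error;    /* its quantization error */
} SearchResult;

/* err1, err2: precomputed quantization errors of the candidates in the
 * first and second fixed excitation codebook subsets (assumption). */
SearchResult search_two_subsets(const double *err1, size_t n1,
                                const double *err2, size_t n2)
{
    assert(err1 && n1 > 0);
    assert(err2 || n2 == 0);
    SearchResult best = {1, 0, err1[0]};

    for (size_t i = 1; i < n1; i++)        /* ST701 */
        if (err1[i] < best.error) {
            best.index = i;
            best.error = err1[i];
        }
    for (size_t j = 0; j < n2; j++)        /* ST702: replace only if better */
        if (err2[j] < best.error) {
            best.subset = 2;
            best.index = j;
            best.error = err2[j];
        }
    return best;
}
```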

ST701 and ST702 differ only in that different dispersion vectors are applied to different fixed excitation codebooks. The different fixed excitation codebooks are provided such that the excitation code vectors they generate have different characteristics (different numbers of excitation pulses, for instance).

The fixed excitation codebook subsets may be provided with different numbers of excitation pulses, such that the first fixed excitation codebook subset generates fixed excitation vectors composed of two excitation pulses and the second fixed excitation codebook subset generates fixed excitation vectors composed of five excitation pulses. Moreover, fixed excitation codebook subsets of different combinations of excitation pulses may be provided, such that the first fixed excitation codebook subset generates fixed excitation vectors of combinations of closely positioned pulses and the second fixed excitation codebook subset generates fixed excitation vectors in which a number of excitation pulses are spread over the whole vector. For example, even though the first and second fixed excitation codebook subsets generate excitation vectors composed of the same number of pulses, the first fixed excitation codebook subset generates fixed excitation vectors in which all pulses are placed within a range of a predetermined number of samples, M (for instance, 2-10 samples), while the second fixed excitation codebook subset generates fixed excitation vectors in which the intervals between all excitation pulses are above a predetermined number of samples, M′ (for instance, 10 samples).

As described above, by applying dedicated dispersion vectors to excitation vectors of specific shapes of high frequency of use, it is possible to effectively improve the quality of decoded speech. Moreover, by applying different dispersion vectors depending on the characteristics of pulse excitation vectors, it is possible to effectively improve the quality of decoded speech.

Incidentally, as long as the configuration is such that a number of dedicated dispersion vectors are provided only for pulse excitation vectors of specific shapes with high frequency of use, the increase or decrease in the number of dispersion vector patterns is of minor significance, and likewise the trouble of designing dispersion vector patterns is of minor significance.

On the other hand, the quality of decoded speech can be improved very effectively and efficiently. That is, providing many dispersion vectors that contribute little to actual sound quality improvement is wasteful, whereas according to the present invention, by adding a small number of dedicated dispersion patterns (additional dispersion vectors), it is possible to efficiently achieve the effect of improving sound quality.

The above described fixed excitation codebook can be implemented by means of hardware, and it is also possible to store the necessary vector data in a database and, using this data, generate waveform data of fixed excitation vectors by means of software.

Second Embodiment

A digital filter with a high-frequency emphasis function is conventionally provided in the stage after the synthesis filter where signal processing is performed, and, generally, this filter is a high-pass filter implemented as a first-order digital filter, which is disclosed, for example, in J-H. Chen and A. Gersho, “Adaptive Postfiltering for Quality Enhancement of Coded Speech”, IEEE Trans. Speech & Audio Processing, Vol. 3, No. 1, January 1995.

In contrast, the present embodiment is characterized in that, at the speech decoding end, unique high-frequency emphasis processing is applied to signals before the synthesis filter.

FIG. 16 is a block diagram showing a configuration of speech decoder 111 of FIG. 2.

Referring to FIG. 16, multiplex separation section 801 separates coded information output from RF demodulator 110, which is multiplex coded information, into individual code information. Separated LPC code (L) is output to LPC decoding section 802, separated adaptive excitation vector code (A) is output to adaptive excitation codebook 805, separated excitation gain code (G) is output to quantization gain generation section 806, and separated fixed excitation vector code (F) is output to fixed excitation codebook 807.

LPC decoding section 802 decodes an LPC from code (L) output from multiplex separation section 801, and outputs it to synthesis filter 803. Adaptive excitation codebook 805 takes one frame of samples as an adaptive excitation vector from the past drive excitation signal samples specified by code (A) output from multiplex separation section 801, and outputs it to multiplier 808.

Quantization gain generation section 806 decodes an adaptive excitation vector gain and a fixed excitation vector gain specified by excitation gain code (G) output from multiplex separation section 801, and outputs them to multiplier 808 and multiplier 809, respectively.

Fixed excitation codebook 807 generates a fixed excitation vector specified by code (F) output from multiplex separation section 801, and outputs it to multiplier 809.

Multiplier 808 multiplies the adaptive excitation vector by the above adaptive excitation vector gain, and outputs the result to adder 810. Multiplier 809 multiplies the fixed excitation vector by the fixed excitation vector gain, and outputs the result to adder 810.

Adder 810 adds the adaptive excitation vector and the fixed excitation vector output from multipliers 808 and 809 after gain multiplication, generates a drive excitation vector, and outputs it to high-frequency emphasis section 811.
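The operation of multipliers 808 and 809 and adder 810 amounts to a gain-weighted sum, as the following minimal C sketch shows. The function name `drive_excitation` and its parameter names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* ex[i] = gain_a * adaptive[i] + gain_f * fixed[i]
 * (multiplier 808, multiplier 809, and adder 810). */
void drive_excitation(const double *adaptive, double gain_a,
                      const double *fixed, double gain_f,
                      double *ex, size_t n)
{
    assert(adaptive && fixed && ex);
    for (size_t i = 0; i < n; i++)
        ex[i] = gain_a * adaptive[i] + gain_f * fixed[i];
}
```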

High-frequency emphasis section 811 (high-frequency emphasis postfilter) applies unique high-frequency emphasis processing to the drive excitation vector (for example, high-frequency emphasis processing is performed such that the degree of amplitude emphasis is higher for components of higher frequency) and outputs the signal after high-frequency emphasis to synthesis filter 803. The details of high-frequency emphasis section 811 will be explained later.

Synthesis filter 803 performs filter synthesis of the excitation vector output from high-frequency emphasis section 811 as a drive signal using a filter coefficient decoded by LPC decoding section 802, and outputs the reconstructed signal to post-processing section 804.

Post-processing section 804 performs processing such as formant emphasis and pitch emphasis that improves the subjective quality of speech, and processing that improves the subjective quality of environmental noise, and thereafter outputs the final decoded speech signal to D/A converter 112.

Next, high-frequency emphasis processing will be described in detail with reference to FIG. 17.

Generally, in CELP encoding, the high-frequency component of a decoded signal tends to weaken. This tendency intensifies especially at low bit rates, and so by emphasizing the high-frequency component of a decoded signal, it is possible to improve the subjective quality to a certain degree.

In high-frequency emphasis section 811 (high-frequency emphasis postfilter) of FIG. 17, an excitation vector is input to high-pass filter 901 (HPF), adder 902, and adder 903.

High-pass filter 901 extracts the high-frequency component that needs to be amplified. The component of the drive excitation vector above the cutoff frequency of high-pass filter 901 is output to adder 903, log power calculator 904, and multiplier 906.

Adder 903 subtracts the high-frequency component of the excitation vector from the excitation vector, and outputs the result to log power calculator 905.

Log power calculator 904 calculates the log power of the high-frequency component of the excitation vector and outputs the result to power ratio calculator 907. Log power calculator 905 calculates the log power of the signal that is the excitation vector minus the high-frequency component, and outputs the result to power ratio calculator 907.

Power ratio calculator 907 calculates the log power ratio between the high-frequency component and the other components of the excitation vector, and outputs the result to emphasis coefficient calculator 908.

Emphasis coefficient calculator 908 calculates the coefficient (emphasis coefficient Rr) by which to multiply the high-frequency component of the excitation vector, such that the log power ratio becomes basically constant.

To be more specific, where a signal output from log power calculator 904 is Eh[i], a signal output from log power calculator 905 is El[i], and L indicates the subframe length, log power ratio R output from power ratio calculator 907 can be expressed by the following equation:

R=log10(ΣEl[i])−log10(ΣEh[i]) (i=0, 1, ..., L−1)  (1)

Then, to bring this log power ratio R to constant value Cr (0.42, for instance), emphasis coefficient calculator 908 obtains coefficient Rr as the difference between R and Cr in the log domain, which corresponds to their ratio in the linear domain, by the following equation (2):

Rr=R−Cr  (2)

Limiter 909 sets a lower limit value (for instance, 0) and an upper limit value (for instance, 0.3) of coefficient Rr, making coefficient Rr the upper limit value when the value of coefficient Rr calculated by emphasis coefficient calculator 908 is larger than the upper limit value, and making coefficient Rr the lower limit value when the value of coefficient Rr is less than the lower limit value.

Smoothing circuit 910 smooths the values of emphasis coefficient Rr over time such that the value of emphasis coefficient Rr changes smoothly between subframes and between samples.

To be more specific, first, as indicated by the following equation (3), the log power ratio is converted to the linear domain and 1 is subtracted. This is so that only the portion above 1.0 is added to the original excitation signal (from adder 810), from which the high-frequency component has not been subtracted.

Rrl=pow(10., Rr)−1  (3)

Then, smoothing is performed such that Rrl changes smoothly between (sub)frames. The smoothing coefficient α is set so as not to make the smoothing excessively strong (for instance, α=0.3).

Rrl′=α×Rrl′+(1−α)×Rrl  (4)

Moreover, when emphasis coefficient Rrl′ after smoothing is multiplied by output signal exh[i] from high-pass filter 901 and the result is added to excitation vector ex[i], Rrl′ is further smoothed on a per sample basis to give Rrl″, as in the following equation (5). This smoothing processing is relatively strong (for instance, β=0.9).

for (i = 0; i < L; i++) {
    Rrl″ = β×Rrl″ + (1−β)×Rrl′;
    exn[i] = ex[i] + Rrl″×exh[i];
}

Multiplier 906 multiplies high-frequency component exh[i] of the excitation vector output from high-pass filter 901 by emphasis coefficient Rrl″ smoothed in smoothing circuit 910.

Adder 902 adds high-frequency component signal Rrl″×exh[i], multiplied by the smoothed coefficient, to excitation vector ex[i] to obtain exn[i], and outputs the result to synthesis filter 803.

Above exn[i] can be directly output to synthesis filter 803, and yet it is more common to perform scaling processing so as to give the same power as original excitation vector ex[i]. Such scaling processing may be performed after adder 902, or above Rrl″ may be calculated in consideration of scaling processing. In the latter case, an input line from high-pass filter 901 to smoothing circuit 910 is necessary. In the former case, a scaling processing section is placed between adder 902 and synthesis filter 803, and the excitation vector (from adder 810) and the excitation vector after high-frequency emphasis (from adder 902) are input into the scaling processing section.

The processing in detail is as follows:

(when performed after adder 902)
Ene_ex = Σ(ex[i]×ex[i])  (i=0, 1, ..., L−1)
Ene_exn = Σ(exn[i]×exn[i])
Scl = √(Ene_ex/Ene_exn)
for (i = 0; i < L; i++) {
    Scl′ = β×Scl′ + (1−β)×Scl;
    exn[i] = exn[i]×Scl′;
}

(when scaling processing is included in Rrl″)
Ene_ex = Σ(ex[i]×ex[i])  (i=0, 1, ..., L−1)
Ene_exn = Σ((Rrl′×exh[i] + ex[i])×(Rrl′×exh[i] + ex[i]))
Scl = √(Ene_ex/Ene_exn)
for (i = 0; i < L; i++) {
    Rrl″ = β×Rrl″ + (1−β)×Scl;
    exn[i] = Rrl″×(Rrl′×exh[i] + ex[i]);
}

The characteristics of high-pass filter 901 are adjusted so as to optimize the subjective quality of decoded speech signals. To be more specific, a second-order IIR filter with a cutoff frequency of approximately 3 kHz at a sampling frequency of 8 kHz is preferable. In addition, according to the present embodiment, the cutoff frequency can be designed freely so as to be suitable for the speech signal encoding characteristics of the encoder. Moreover, the order of the above high-pass filter can be designed freely as well so as to have the desired filter characteristics and to meet a requirement on the amount of computation that can be tolerated.
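An IIR high-pass filter of this kind could be realized as a second-order (biquad) section in Direct Form I, as in the following C sketch. The text gives no filter coefficients, so they are left to the caller; the type `Biquad` and the function `biquad_process` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    double b0, b1, b2;       /* feedforward coefficients */
    double a1, a2;           /* feedback coefficients (a0 normalized to 1) */
    double x1, x2, y1, y2;   /* delay line, carried across calls */
} Biquad;

/* y[n] = b0*x[n] + b1*x[n-1] + b2*x[n-2] - a1*y[n-1] - a2*y[n-2] */
void biquad_process(Biquad *f, const double *in, double *out, size_t n)
{
    assert(f && in && out);
    for (size_t i = 0; i < n; i++) {
        double x = in[i];
        double y = f->b0 * x + f->b1 * f->x1 + f->b2 * f->x2
                 - f->a1 * f->y1 - f->a2 * f->y2;
        f->x2 = f->x1;
        f->x1 = x;
        f->y2 = f->y1;
        f->y1 = y;
        out[i] = y;
    }
}
```

Keeping the delay line in the struct lets the filter run continuously across subframes without a discontinuity at the boundaries.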

By thus performing high-frequency emphasis processing by means of a digital filter with a unique transfer function, it is possible to compensate for the gain reduction of an excitation signal in high-frequency ranges and achieve flat characteristics, so that unique filter characteristics effective for auditory enhancement can be implemented, thereby enabling effective improvement of the quality of decoded speech. For instance, by performing high-frequency emphasis, it is possible to prevent decoded speech from sounding muffled.

Moreover, the high-frequency emphasis postfilter can be readily provided before a synthesis filter, and the present invention can be readily applied to actual products.

As described above, the present invention enables efficient enhancement of the quality of decoded speech by adding minimum hardware. The present invention also enables performance improvement of a fixed excitation codebook that has pulse dispersion configurations. Moreover, it is possible to effectively compensate for the high-frequency attenuation of excitation vectors in CELP encoding and improve the subjective quality.

The fixed vector generation method, the CELP type speech encoding method, and the CELP type speech decoding method of the present invention can be implemented by installing a program through communication channels or from a CD or other storage media and executing it by means of a controller such as a CPU.

The present application is based on Japanese Patent Application No. 2002-043878, filed on Feb. 20, 2002, the entire content of which is expressly incorporated herein by reference.

INDUSTRIAL APPLICABILITY

The present invention is suitable for use in a CELP type speech encoder or a CELP type speech decoder.

1. A CELP type speech decoder that receives an excitation gain code, an adaptive excitation vector code, and a fixed excitation vector code associated with encoded speech transmitted from a CELP type speech encoder and decodes the encoded speech, said CELP type speech decoder comprising: a quantized gain generating section that receives the excitation gain code from the CELP type speech encoder and decodes an adaptive excitation vector gain and a fixed excitation vector gain specified by the excitation gain code; an adaptive excitation codebook that receives the adaptive excitation vector code from the CELP type speech encoder and takes one frame of samples as an adaptive excitation vector from past excitation signal samples specified by the adaptive excitation vector code; a fixed excitation codebook that receives the fixed excitation vector code from the CELP type speech encoder and generates a fixed excitation vector specified by the fixed excitation vector code; an excitation vector generating section that generates an excitation vector by adding a vector obtained by multiplying the adaptive excitation vector gain and the adaptive excitation vector, and a vector obtained by multiplying the fixed excitation vector gain and the fixed excitation vector; a high-frequency emphasis section that performs high-frequency emphasis processing on the excitation vector generated by the excitation vector generating section; and a synthesis filter that performs filter synthesis of the excitation vector output from the high-frequency emphasis section employing a set of filter coefficients to output decoded speech data, wherein said fixed excitation codebook comprises: a comparing section that compares the shape of a pulse excitation vector with predetermined shapes to determine a predetermined shape which matches the shape of said pulse excitation vector; a storing section that stores sets of dispersion vectors that are designed exclusively for each of said predetermined shapes; a selecting section that selects a set of said dispersion vectors that are associated with the predetermined shape which matches the shape of said pulse excitation vector; and a convolving section that convolves said pulse excitation vector with one of the dispersion vectors in the selected set to obtain the fixed excitation vector.

2. A CELP type speech decoder that receives an excitation gain code, an adaptive excitation vector code, and a fixed excitation vector code associated with encoded speech transmitted from a CELP type speech encoder and decodes the encoded speech, said CELP type speech decoder comprising: a quantized gain generating section that receives the excitation gain code from the CELP type speech encoder and decodes an adaptive excitation vector gain and a fixed excitation vector gain specified by the excitation gain code; an adaptive excitation codebook that receives the adaptive excitation vector code from the CELP type speech encoder and takes one frame of samples as an adaptive excitation vector from past excitation signal samples specified by the adaptive excitation vector code; a fixed excitation codebook that receives the fixed excitation vector code from the CELP type speech encoder and generates a fixed excitation vector specified by the fixed excitation vector code; an excitation vector generating section that generates an excitation vector by adding a vector obtained by multiplying the adaptive excitation vector gain and the adaptive excitation vector, and a vector obtained by multiplying the fixed excitation vector gain and the fixed excitation vector; a high-frequency emphasis section that performs high-frequency emphasis processing on the excitation vector generated by said excitation vector generating section; and a synthesis filter that performs filter synthesis of the excitation vector output from the high-frequency emphasis section employing a set of filter coefficients to output decoded speech data, wherein the high-frequency emphasis section comprises: a high pass filter that receives the excitation vector generated by said excitation vector generating section and allows a high-frequency component of the excitation vector generated by said excitation vector generating section to pass; a first log power calculator that calculates a log power of the excitation vector that has passed through the high pass filter; an adder that performs processing that subtracts the excitation vector that has passed through the high pass filter from the excitation vector generated by said excitation vector generating section without passing through the high pass filter; a second log power calculator that calculates the log power of the excitation vector output from the adder, from which the high frequency component is removed; a power ratio calculator that calculates a ratio between the log powers calculated by the first and second log power calculators; and a coefficient calculator that calculates a value of an emphasis coefficient for multiplying the high frequency component of the excitation vector generated by said excitation vector generating section that causes the ratio between the log powers to be basically a constant value, wherein: the high-frequency emphasis section performs high-frequency emphasis processing by multiplying a signal component that has passed through the high pass filter by the emphasis coefficient calculated by the coefficient calculator and adding a result thereof to the excitation vector generated by said excitation vector generating section, to obtain an addition result for outputting to the synthesis filter.